47 research outputs found

    Floating-Point Matrix Product on FPGA

    This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder. Copyright IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

    A general framework for efficient FPGA implementation of matrix product

    Original article can be found at: http://www.medjcn.com/ Copyright Softmotor Limited. High-performance systems are required by developers for fast processing of computationally intensive applications. Reconfigurable hardware devices in the form of Field-Programmable Gate Arrays (FPGAs) have been proposed as viable building blocks for constructing high-performance systems at an economical price. Given the importance and widespread use of matrix algorithms in scientific computing applications, they seem ideal candidates to exploit the advantages offered by FPGAs. In this paper, a system for generating matrix algorithm cores is described. The system provides a catalog of efficient, user-customizable cores designed for FPGA implementation, spanning three matrix algorithm categories: (i) matrix operations, (ii) matrix transforms and (iii) matrix decomposition. The generated core can be either a general-purpose or an application-specific core. The methodology used in the design and implementation of two specific image processing application cores is presented. The first core is a fully pipelined matrix multiplier for colour space conversion based on distributed arithmetic principles, while the second is a parallel floating-point matrix multiplier designed for 3D affine transformations. Peer reviewed.
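
    For readers unfamiliar with the second application, a 3D affine transformation can be written as a 4×4 matrix applied to a point in homogeneous coordinates. The sketch below is only a software illustration of that arithmetic; the function names and the example matrix are assumptions and say nothing about how the paper's FPGA core is organised.

```python
# Illustrative only: a 3D affine transform as a 4x4 matrix-vector product.
# Names and the example matrix are assumptions, not taken from the paper.

def mat_vec4(matrix, vector):
    """Multiply a 4x4 matrix by a length-4 vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]

def apply_affine(matrix, point):
    """Apply a 4x4 affine matrix to a 3D point using homogeneous coordinates."""
    x, y, z, _w = mat_vec4(matrix, [point[0], point[1], point[2], 1.0])
    return (x, y, z)

# Uniform scale by 2 followed by a translation of (1, 2, 3).
SCALE_THEN_TRANSLATE = [
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
    [0.0, 0.0, 2.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

transformed = apply_affine(SCALE_THEN_TRANSLATE, (1.0, 1.0, 1.0))
print(transformed)  # (3.0, 4.0, 5.0)
```

    Each output coordinate is an independent dot product, which is the kind of structure a parallel multiplier can exploit.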

    Using thermochromism to simulate blood oxygenation in extracorporeal membrane oxygenation

    Introduction: Extracorporeal membrane oxygenation (ECMO) training programs employ real ECMO components, causing them to be extremely expensive while offering little realism in terms of blood oxygenation and pressure. To overcome those limitations, we are developing a standalone modular ECMO simulator that reproduces ECMO’s visual, audio and haptic cues using affordable mechanisms. We present a central component of this simulator, capable of visually reproducing blood oxygenation color change using thermochromism. Methods: Our simulated ECMO circuit consists of two physically distant modules, responsible for adding and withdrawing heat from a thermochromic fluid. This manipulation of heat creates a temperature difference between the fluid in the drainage line and the fluid in the return line of the circuit and, hence, a color difference. Results: Thermochromic ink mixed with concentrated dyes was used to create a recipe for a realistic and affordable blood-colored fluid. The implemented “ECMO circuit” reproduced blood’s oxygenation and deoxygenation color difference or lack thereof. The heat control circuit costs 300 USD to build and the thermochromic fluid costs 40 USD/L. During a ten-hour in situ demonstration, nineteen ECMO specialists rated the fidelity of the oxygenated and deoxygenated “blood” and the color contrast between them as highly realistic. Conclusions: Using low-cost yet high-fidelity simulation mechanisms, we implemented the central subsystem of our modular ECMO simulator, which creates the look and feel of an ECMO circuit without using an actual one. Peer reviewed. Final Accepted Version.

    Real-time ECG Monitoring using Compressive sensing on a Heterogeneous Multicore Edge-Device

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. In typical ambulatory health monitoring systems, wearable medical sensors are deployed on the human body to continuously collect and transmit physiological signals to a nearby gateway that forwards the measured data to a cloud-based healthcare platform. However, this model often fails to respect the strict requirements of healthcare systems. Wearable medical sensors are very limited in terms of battery lifetime; in addition, the system's reliance on a cloud makes it vulnerable to connectivity and latency issues. Compressive sensing (CS) theory has been widely deployed in electrocardiogram (ECG) monitoring applications to optimize the power consumption of wearable sensors. The solution proposed in this paper aims to tackle these limitations by empowering a gateway-centric connected health solution, where the most power-consuming tasks are performed locally on a multicore processor. This paper explores the efficiency of real-time CS-based recovery of ECG signals on an IoT gateway embedded with ARM’s big.LITTLE™ multicore for different signal dimensions and allocated computational resources. Experimental results show that the gateway is able to reconstruct ECG signals in real time. Moreover, they demonstrate that using a high number of cores speeds up execution and further optimizes energy consumption. The paper identifies the resource allocation configurations that provide optimal performance. The paper concludes that multicore processors have the computational capacity and energy efficiency to promote gateway-centric solutions rather than cloud-centric platforms.

    Robust event-based non-intrusive appliance recognition using multi-scale wavelet packet tree and ensemble bagging tree

    Open access article. Providing the user with appliance-level consumption data is the core of every energy efficiency system. To that end, non-intrusive load monitoring is employed to extract appliance-specific consumption data at low cost, without the need to install a separate submeter for each electrical device. In this context, we propose in this paper a novel non-intrusive appliance recognition system based on (i) detecting events in the aggregated power signal using a novel and powerful scheme, (ii) applying a multiscale wavelet packet tree to collect comprehensive energy consumption features, and (iii) adopting an ensemble bagging tree classifier and comparing its performance with various machine learning schemes. Moreover, to validate the proposed model, an empirical investigation is conducted on two real and public energy consumption datasets, namely GREEND and REDD, in which consumption readings are collected at low frequencies. In addition, a comprehensive review of recent non-intrusive load monitoring approaches is presented, in which their characteristics, performances and limitations are described. The proposed non-intrusive load monitoring system shows high appliance recognition performance in terms of accuracy and F1 score, with low time complexity, when applied to different households from the GREEND and REDD repositories, in which every house includes various domestic appliances. For example, average accuracies of 97.01% and 96.36% have been reached on the GREEND and REDD datasets, respectively, outperforming almost all existing solutions considered in this framework.

    Intelligent co-operative processor-in-memory

    Original article can be found at: http://www.medjec.com/ Copyright Softmotor Limited. Advances in VLSI technology are enabling processor-memory integration to bridge the processor-memory performance gap. They are also a key driver in the innovation of a new concept called Processor-In-Memory (PIM). The work described in this paper capitalises on the extensive work carried out on PIMs in general and develops a road map for an intelligent revision of a PIM architecture referred to as Co-operative Intelligent Memory (CIM). The route to a CIM goes through the development of a Cooperative Pseudo Intelligent Memory (CPIM), which serves as a proof of concept and a midpoint in the ratification of the intelligence needed for a full CIM implementation. Both architectures use a hierarchical two-level CPU structure referred to as major and minor CPUs. By partitioning computation, dividing the workload between major and minor CPUs in an intelligent manner and without any pre-processor compilation or kernel task scheduling, the PIM system can be made more efficient and co-operative for a class of tasks that rely heavily on memory-to-memory iterative processes. The proposed architectures exploit the key feature of the iterative process by using vectors that characterize the iteration. The process of intelligently identifying these vectors is described in this paper. In addition, the performance of the proposed architectures has been evaluated. Peer reviewed.

    Dynamic Co-operative Intelligent Memory


    xDSL Network Upgrade Employing FPGAs

    DOI: 10.1109/DELTA.2008.84. This paper proposes an upgrade scenario for xDSL networks to provide broadband access for extended link lengths while demonstrating network grooming, in which more than one subscriber uses a single network connection concurrently. This is achieved by applying Direct Spread Code Division Multiple Access (DS-CDMA) in a Fiber-to-the-Cabinet (FTTC) topology by means of a Field Programmable Gate Array (FPGA), used to demonstrate simultaneous user transmission and System-on-Chip (SoC) network element generation. Experimental results show ADSL link-rate efficiencies of 66% and 41% for back-to-back and 12 km-reach fiber links, respectively.
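
    The idea of several subscribers sharing one connection through code division can be illustrated with a toy example. The sketch below assumes two users, length-4 orthogonal Walsh codes and a noiseless link; none of these parameters come from the abstract, and it is not the paper's FPGA design.

```python
# Toy DS-CDMA illustration: two subscribers share one link at the same time.
# The codes, bit patterns and noiseless channel are assumptions for this sketch.

WALSH_CODES = {
    "user_a": [1, 1, 1, 1],
    "user_b": [1, -1, 1, -1],  # orthogonal to user_a's code
}

def spread(bits, code):
    """Map bits {0, 1} to symbols {-1, +1} and multiply each symbol by the chip code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips

def despread(signal, code):
    """Correlate each code-length chunk with the code and threshold at zero."""
    n = len(code)
    bits = []
    for start in range(0, len(signal), n):
        correlation = sum(s * c for s, c in zip(signal[start:start + n], code))
        bits.append(1 if correlation > 0 else 0)
    return bits

bits_a = [1, 0, 1, 1]
bits_b = [0, 0, 1, 0]

# Both users transmit concurrently; the shared link carries the chip-wise sum.
link = [a + b for a, b in zip(spread(bits_a, WALSH_CODES["user_a"]),
                              spread(bits_b, WALSH_CODES["user_b"]))]

recovered_a = despread(link, WALSH_CODES["user_a"])
recovered_b = despread(link, WALSH_CODES["user_b"])
print(recovered_a, recovered_b)  # [1, 0, 1, 1] [0, 0, 1, 0]
```

    Because the two codes are orthogonal, each user's correlation cancels the other user's contribution, so both bit streams are recovered from the single summed signal.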

    License plate localisation based on morphological operations

    Automatic Number Plate Recognition (ANPR) systems allow users to track, identify and monitor moving vehicles by automatically extracting their number plates. This paper presents an improved method to locate car plates in an ANPR system. The proposed method is based on morphological open and close operations in which different Structuring Elements (SEs) are used to eliminate as much of the non-plate region as possible and enhance the plate region. This method has been tested using a database of UK number plates, and the results show significant improvements in detection rate compared to other existing plate localisation systems.
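
    As a reminder of what the open and close operations do, the sketch below applies them to small binary images using a rectangular structuring element. It is a generic textbook formulation, not the paper's method: the SE sizes, the odd-size restriction and the choice to treat pixels outside the image as background are all assumptions.

```python
# Generic binary morphology sketch with a rectangular structuring element (SE).
# Assumptions: odd SE dimensions, SE anchored at its centre, and pixels outside
# the image treated as background (0). Not the paper's implementation.

def _morph(img, se_h, se_w, combine):
    h, w = len(img), len(img[0])
    ry, rx = se_h // 2, se_w // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
                for dy in range(-ry, ry + 1)
                for dx in range(-rx, rx + 1)
            ]
            out[y][x] = combine(window)
    return out

def erode(img, se_h, se_w):
    """A pixel stays 1 only if every pixel under the SE is 1."""
    return _morph(img, se_h, se_w, min)

def dilate(img, se_h, se_w):
    """A pixel becomes 1 if any pixel under the SE is 1."""
    return _morph(img, se_h, se_w, max)

def opening(img, se_h, se_w):
    """Erosion then dilation: removes foreground details smaller than the SE."""
    return dilate(erode(img, se_h, se_w), se_h, se_w)

def closing(img, se_h, se_w):
    """Dilation then erosion: fills background gaps smaller than the SE."""
    return erode(dilate(img, se_h, se_w), se_h, se_w)

noisy = [[0] * 9, [0, 1, 1, 1, 1, 1, 0, 1, 0], [0] * 9]  # wide run plus a speck
gappy = [[0] * 9, [0, 1, 1, 0, 1, 1, 0, 0, 0], [0] * 9]  # run with a 1-pixel gap

print(opening(noisy, 1, 3)[1])  # [0, 1, 1, 1, 1, 1, 0, 0, 0]
print(closing(gappy, 1, 3)[1])  # [0, 1, 1, 1, 1, 1, 0, 0, 0]
```

    With a 1×3 SE, opening removes the isolated pixel while keeping the wide run, and closing bridges the one-pixel gap; choosing different SE shapes changes which structures survive.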

    Comparison of Real-Time DSP-Based Edge Detection Techniques for License Plate Detection

    In this paper, edge detection techniques and their performance are compared when applied to license plate detection using an embedded digital signal processor. License plate detection remains a crucial part of the vehicle license plate recognition process. The edge detection algorithms compared in this work are those reported to be capable of delivering real-time performance: Canny-Deriche-FGL, the Haar and Daubechies-4 wavelet transforms, and the classic Sobel. These particular algorithms are chosen and compared due to their good performance on digital signal processors. The comparison is drawn in terms of speed and license plate detection success. The results show that the Haar wavelet-based edge detector performs better on a DSP, with an LP detection speed of 7.32 ms and 98.6% success using 45,032 UK images containing license plates at 768×288 resolution.
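
    To give a feel for why a Haar-based detector is cheap, the sketch below computes a one-level Haar transform along image rows and thresholds the detail coefficients to flag vertical edges. This is a minimal illustration under assumed conventions (averaging normalisation, even row width, a hypothetical threshold); the abstract does not describe the detector actually benchmarked.

```python
# Minimal one-level Haar sketch for vertical-edge detection along rows.
# Assumptions: even row width, averaging normalisation (divide by 2) and an
# arbitrary threshold. Not the detector evaluated in the paper.

def haar_rows(image):
    """Return (approximation, detail) coefficients for each row of the image."""
    approx, detail = [], []
    for row in image:
        pairs = list(zip(row[0::2], row[1::2]))  # adjacent, non-overlapping pixel pairs
        approx.append([(a + b) / 2 for a, b in pairs])
        detail.append([(a - b) / 2 for a, b in pairs])
    return approx, detail

def vertical_edge_map(image, threshold):
    """Mark pixel pairs whose Haar detail magnitude exceeds the threshold."""
    _, detail = haar_rows(image)
    return [[1 if abs(d) > threshold else 0 for d in row] for row in detail]

image = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]

print(vertical_edge_map(image, threshold=20))  # [[0, 1, 0], [0, 1, 0]]
```

    Each coefficient needs only one addition or subtraction and a halving. Note that this decimated form only sees intensity steps that fall inside a pixel pair, so a step lying exactly between two pairs would be missed.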